1,033 research outputs found

A Comparison of Methods for Data-Driven Cancer Outlier Discovery, and An Application Scheme to Semisupervised Predictive Biomarker Discovery

Author: Karrila Seppo
Lee Julian Hock Ean
Tucker-Kellogg Greg
Publication venue: Libertas Academica
Publication date: 01/01/2011

A core component in translational cancer research is biomarker discovery using gene expression profiling for clinical tumors. This is often based on cell line experiments; one population is sampled for inference in another. We disclose a semisupervised workflow focusing on binary (switch-like, bimodal) informative genes that are likely cancer relevant, to mitigate this non-statistical problem. Outlier detection is a key enabling technology of the workflow, and aids in identifying the focus genes.

Crossref

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Flexible and Accurate Detection of Genomic Copy-Number Changes from aCGH

Author: Greg Tucker-Kellogg
Oscar M Rueda
Ramón Díaz-Uriarte
Publication venue: Public Library of Science
Publication date: 01/01/2007

Genomic DNA copy-number alterations (CNAs) are associated with complex diseases, including cancer: CNAs are indeed related to tumoral grade, metastasis, and patient survival. CNAs discovered from array-based comparative genomic hybridization (aCGH) data have been instrumental in identifying disease-related genes and potential therapeutic targets. To be immediately useful in both clinical and basic research scenarios, aCGH data analysis requires accurate methods that do not impose unrealistic biological assumptions and that provide direct answers to the key question, “What is the probability that this gene/region has CNAs?” Current approaches fail, however, to meet these requirements. Here, we introduce reversible jump aCGH (RJaCGH), a new method for identifying CNAs from aCGH; we use a nonhomogeneous hidden Markov model fitted via reversible jump Markov chain Monte Carlo; and we incorporate model uncertainty through Bayesian model averaging. RJaCGH provides an estimate of the probability that a gene/region has CNAs while incorporating interprobe distance and the capability to analyze data on a chromosome or genome-wide basis. RJaCGH outperforms alternative methods, and the performance difference is even larger with noisy data and highly variable interprobe distance, both commonly found features in aCGH data. Furthermore, our probabilistic method allows us to identify minimal common regions of CNAs among samples and can be extended to incorporate expression data. In summary, we provide a rigorous statistical framework for locating genes and chromosomal regions with CNAs with potential applications to cancer and other complex human diseases

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Direct Inference of SNP Heterozygosity Rates and Resolution of LOH Detection

Author: Brian J Reid
Greg Tucker-Kellogg
Patricia C Galipeau
Steven G Self
Thomas G Paulson
Xiaohong Li
Publication venue: Public Library of Science
Publication date: 01/11/2007

Single nucleotide polymorphisms (SNPs) have been increasingly utilized to investigate somatic genetic abnormalities in premalignancy and cancer. LOH is a common alteration observed during cancer development, and SNP assays have been used to identify LOH at specific chromosomal regions. The design of such studies requires consideration of the resolution for detecting LOH throughout the genome and identification of the number and location of SNPs required to detect genetic alterations in specific genomic regions. Our study evaluated SNP distribution patterns and used probability models, Monte Carlo simulation, and real human subject genotype data to investigate the relationships between the number of SNPs, SNP HET rates, and the sensitivity (resolution) for detecting LOH. We report that variances of SNP heterozygosity rate in dbSNP are high for a large proportion of SNPs. Two statistical methods proposed for directly inferring SNP heterozygosity rates require much smaller sample sizes (intermediate sizes) and are feasible for practical use in SNP selection or verification. Using HapMap data, we showed that a region of LOH greater than 200 kb can be reliably detected, with losses smaller than 50 kb having a substantially lower detection probability when using all SNPs currently in the HapMap database. Higher densities of SNPs may exist in certain local chromosomal regions that provide some opportunities for reliably detecting LOH of segment sizes smaller than 50 kb. These results suggest that the interpretation of the results from genome-wide scans for LOH using commercial arrays needs to consider the relationships among inter-SNP distance, detection probability, and sample size for a specific study. New experimental designs for LOH studies would also benefit from considering the power of detection and sample sizes required to accomplish the proposed aims.
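
The relationship between SNP count, heterozygosity rate, and LOH detection probability described in this abstract can be illustrated with a small calculation. The following is only a minimal sketch under simplifying assumptions that are not taken from the paper: SNPs in a region are treated as independent, all share one heterozygosity rate, and LOH is counted as detectable when at least `min_informative` SNPs in the region are heterozygous. The function names `prob_detect_loh` and `snps_needed` are hypothetical.

```python
from math import comb


def prob_detect_loh(n_snps: int, het_rate: float, min_informative: int = 1) -> float:
    """Probability that at least `min_informative` of `n_snps` SNPs are heterozygous.

    Assumption (not from the paper): SNPs are independent and share one
    heterozygosity rate, so the informative-SNP count is binomial.
    """
    if not 0.0 <= het_rate <= 1.0:
        raise ValueError("het_rate must lie in [0, 1]")
    # Sum the binomial probabilities of seeing too few informative SNPs.
    p_too_few = sum(
        comb(n_snps, k) * het_rate**k * (1.0 - het_rate) ** (n_snps - k)
        for k in range(min_informative)
    )
    return 1.0 - p_too_few


def snps_needed(het_rate: float, target: float,
                min_informative: int = 1, max_snps: int = 10_000) -> int:
    """Smallest SNP count whose detection probability reaches `target`."""
    for n in range(min_informative, max_snps + 1):
        if prob_detect_loh(n, het_rate, min_informative) >= target:
            return n
    raise ValueError("target not reachable within max_snps")


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, round(prob_detect_loh(n, het_rate=0.3), 3))
```

Under this toy model, a lower heterozygosity rate or a stricter `min_informative` threshold raises the number of SNPs a region must contain, which is the qualitative trade-off the abstract describes between inter-SNP distance and detection probability.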

Crossref

Directory of Open Access Journals

PubMed Central

Evaluation of Normalization Procedures for Oligonucleotide Array Data Based On Spiked cRNA Controls

Author: Brown Eugene L
Hill Andrew A.
Hunter Craig P.
Slonim Donna K
Tucker-Kellogg Greg
Whitley Maryann Z
Publication venue: Springer Science and Business Media LLC
Publication date: 05/10/2010

Background: Affymetrix oligonucleotide arrays simultaneously measure the abundances of thousands of mRNAs in biological samples. Comparability of array results is necessary for the creation of large-scale gene expression databases. The standard strategy for normalizing oligonucleotide array readouts has practical drawbacks. We describe alternative normalization procedures for oligonucleotide arrays based on a common pool of known biotin-labeled cRNAs spiked into each hybridization. Results: We first explore the conditions for validity of the 'constant mean assumption', the key assumption underlying current normalization methods. We introduce 'frequency normalization', a 'spike-in'-based normalization method which estimates array sensitivity, reduces background noise and allows comparison between array designs. This approach does not rely on the constant mean assumption and so can be effective in conditions where standard procedures fail. We also define 'scaled frequency', a hybrid normalization method relying on both spiked transcripts and the constant mean assumption while maintaining all other advantages of frequency normalization. We compare these two procedures to a standard global normalization method using experimental data. We also use simulated data to estimate accuracy and investigate the effects of noise. We find that scaled frequency is as reproducible and accurate as global normalization while offering several practical advantages. Conclusions: Scaled frequency quantitation is a convenient, reproducible technique that performs as well as global normalization on serial experiments with the same array design, while offering several additional features. Specifically, the scaled-frequency method enables the comparison of expression measurements across different array designs, yields estimates of absolute message abundance in cRNA and determines the sensitivity of individual arrays.

Harvard University - DASH

Genetic progression and the waiting time to cancer

Author: Arne Traulsen
Bert Vogelstein
David Dingli
Greg Tucker-Kellogg
Kenneth W Kinzler
Martin A Nowak
Niko Beerenwinkel
Tibor Antal
Victor E Velculescu
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2007

Cancer results from genetic alterations that disturb the normal cooperative behavior of cells. Recent high-throughput genomic studies of cancer cells have shown that the mutational landscape of cancer is complex and that individual cancers may evolve through mutations in as many as 20 different cancer-associated genes. We use data published by Sjoblom et al. (2006) to develop a new mathematical model for the somatic evolution of colorectal cancers. We employ the Wright-Fisher process for exploring the basic parameters of this evolutionary process and derive an analytical approximation for the expected waiting time to the cancer phenotype. Our results highlight the relative importance of selection over both the size of the cell population at risk and the mutation rate. The model predicts that the observed genetic diversity of cancer genomes can arise under a normal mutation rate if the average selective advantage per mutation is on the order of 1%. Increased mutation rates due to genetic instability would allow even smaller selective advantages during tumorigenesis. The complexity of cancer progression thus can be understood as the result of multiple sequential mutations, each of which has a relatively small but positive effect on net cell growth. Comment: Details available as supplementary material at http://www.people.fas.harvard.edu/~antal/publications.htm
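
The Wright-Fisher process with selection and mutation mentioned in this abstract can be sketched as a short simulation. This is an illustrative toy version under stated assumptions, not the authors' model or parameterization: the population size is fixed, a cell carrying j mutations has relative fitness (1 + advantage)^j, each offspring gains at most one further mutation per generation with probability `mutation_rate`, and the "cancer phenotype" is reached when any cell carries `n_mutations` mutations. The function name `waiting_time` and all parameter names are hypothetical.

```python
import random
from typing import Optional


def waiting_time(pop_size: int, n_mutations: int, mutation_rate: float,
                 advantage: float, max_generations: int = 100_000,
                 seed: Optional[int] = None) -> Optional[int]:
    """Generations until some cell carries `n_mutations` mutations.

    Returns None if that does not happen within `max_generations`.
    """
    rng = random.Random(seed)
    classes = range(n_mutations + 1)
    # counts[j] = number of cells currently carrying exactly j mutations.
    counts = [pop_size] + [0] * n_mutations
    for generation in range(1, max_generations + 1):
        # Selection: each offspring picks a parent class with probability
        # proportional to class size times class fitness.
        weights = [counts[j] * (1.0 + advantage) ** j for j in classes]
        parents = rng.choices(classes, weights=weights, k=pop_size)
        new_counts = [0] * (n_mutations + 1)
        for j in parents:
            # Mutation: an offspring may gain one additional mutation.
            if j < n_mutations and rng.random() < mutation_rate:
                j += 1
            new_counts[j] += 1
        counts = new_counts
        if counts[n_mutations] > 0:
            return generation
    return None


if __name__ == "__main__":
    # Small illustrative parameters only; not the values used in the paper.
    print(waiting_time(pop_size=1000, n_mutations=3,
                       mutation_rate=1e-3, advantage=0.05, seed=1))
```

Averaging the returned value over many seeds gives a simulated estimate of the expected waiting time, which is the quantity the abstract says the authors approximate analytically.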

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Repository for Publications and Research Data

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Search algorithms as a framework for the optimization of drug combinations

Author: A Wagner
AJ Viterbi
AL Barabasi
Andrew D. McCulloch
B Schmidt
CM Reidys
D Calzolari
Diego Calzolari
E Lin
EK Kemsley
F Jelinek
G Paternostro
GA Bekey
Giovanni Paternostro
GR Zimmermann
Greg Tucker-Kellogg
J Bechhoefer
J Lamb
JA Radford
Jacob D. Feala
JB Fitzgerald
JD Feala
Jennifer Schofield
JG Hardman
JG Wood
JJ Schneider
JM Toivonen
John C. Reed
JW Gargano
K Wang
Laurence Coquin
M Bohm
M Nerenberg
PG Gobbi
PK Wong
R Johannesson
R Marcus
R Palmer
R Pfeifer
RA Kloner
RA Weinberg
RP Araujo
Stefania Bruschi
V Diehl
WR Greco
Publication venue: Public Library of Science (PLoS)
Publication date: 13/10/2008

Combination therapies are often needed for effective clinical outcomes in the management of complex diseases, but presently they are generally based on empirical clinical experience. Here we suggest a novel application of search algorithms, originally developed for digital communication, modified to optimize combinations of therapeutic interventions. In biological experiments measuring the restoration of the decline with age in heart function and exercise capacity in Drosophila melanogaster, we found that search algorithms correctly identified optimal combinations of four drugs with only one third of the tests performed in a fully factorial search. In experiments identifying combinations of three doses of up to six drugs for selective killing of human cancer cells, search algorithms resulted in a highly significant enrichment of selective combinations compared with random searches. In simulations using a network model of cell death, we found that the search algorithms identified the optimal combinations of 6-9 interventions in 80-90% of tests, compared with 15-30% for an equivalent random search. These findings suggest that modified search algorithms from information theory have the potential to enhance the discovery of novel therapeutic drug combinations. This report also helps to frame a biomedical problem that will benefit from an interdisciplinary effort and suggests a general strategy for its solution. Comment: 36 pages, 10 figures, revised versio

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Translating Clinical Findings into Knowledge in Drug Safety Evaluation - Drug Induced Liver Injury Prediction System (DILIps)

Author: A Bender
A Kindmark
AK Daly
AL Hopkins
AM Cohen
B Yan
C Knox
D Fourches
Don Ding
DS Wishart
GP Aithal
Greg Tucker-Kellogg
H Hosomi
H Olson
Hong Fang
J O'Donohue
J Thibaudeau
JB Singer
JC Nacher
JH Nettles
JHJ Xu
K Hirata
LE Jensen
M Chen
M Kuhn
MA Hamburg
MA Yildirim
N Anderson
N Greene
N Kaplowitz
Nidhi
PJ O'Brien
Q Shi
Qiang Shi
R McKenzie
RA Wilke
RC Glen
Reagan Kelly
RJ Andrade
S Ekins
S Russmann
SK Sharma
TJ Crisman
UA Boelsterli
VJ Navarro
Weida Tong
WL Morison
WM Lee
XM Deng
XYN Xu
Y Duguay
Zhichao Liu
Publication venue: Public Library of Science
Publication date: 01/12/2011

Drug-induced liver injury (DILI) is a significant concern in drug development due to the poor concordance between preclinical and clinical findings of liver toxicity. We hypothesized that the DILI types (hepatotoxic side effects) seen in the clinic can be translated into the development of predictive in silico models for use in the drug discovery phase. We identified 13 hepatotoxic side effects with high accuracy for classifying marketed drugs for their DILI potential. We then developed in silico predictive models for each of these 13 side effects, which were further combined to construct a DILI prediction system (DILIps). The DILIps yielded 60–70% prediction accuracy for three independent validation sets. To enhance the confidence for identification of drugs that cause severe DILI in humans, the “Rule of Three” was developed in DILIps by using a consensus strategy based on 13 models. This gave high positive predictive value (91%) when applied to an external dataset containing 206 drugs from three independent literature datasets. Using the DILIps, we screened all the drugs in DrugBank and investigated their DILI potential in terms of protein targets and therapeutic categories through network modeling. We demonstrated that two therapeutic categories, anti-infectives for systemic use and musculoskeletal system drugs, were enriched for DILI, which is consistent with current knowledge. We also identified protein targets and pathways that are related to drugs that cause DILI by using pathway analysis and co-occurrence text mining. While marketed drugs were the focus of this study, the DILIps has potential as an evaluation tool to screen and prioritize new drug candidates or chemicals, such as environmental chemicals, to avoid those that might cause liver toxicity. We expect that the methodology can also be applied to other drug safety endpoints, such as renal or cardiovascular toxicity.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central